
    Heterogeneous graph learning for explainable recommendation over academic networks

    With the explosive growth in the number of new research-degree graduates every year, early-career researchers face unprecedented challenges in finding a position at a suitable institution. This study aims to understand academic job-transition behavior and, on that basis, recommend suitable institutions for PhD graduates. Specifically, we design a deep learning model that predicts the career moves of early-career researchers and provides suggestions. The design is built on top of scholarly/academic networks, which contain abundant information about scientific collaboration among scholars and institutions. We construct a heterogeneous scholarly network to facilitate both the exploration of career-move behavior and the recommendation of institutions to scholars. We devise an unsupervised learning model called HAI (Heterogeneous graph Attention InfoMax), which combines an attention mechanism with mutual information maximization for institution recommendation. Moreover, we propose scholar attention and meta-path attention to discover the hidden relationships between meta-paths. With these mechanisms, HAI provides ranked recommendations with explainability. We evaluate HAI on a real-world dataset against baseline methods. Experimental results verify the effectiveness and efficiency of our approach. © 2021 ACM
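
    A minimal sketch of the two mechanisms named above, attention over meta-path-specific embeddings and a Deep Graph Infomax-style mutual-information objective, is given below. The module names, dimensions, and the permutation-based corruption are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: attention over meta-path embeddings plus a DGI-style
# InfoMax loss; dimensions and the corruption scheme are assumed.
import torch
import torch.nn as nn

class MetaPathAttention(nn.Module):
    """Learn one weight per meta-path and fuse per-meta-path embeddings."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1, bias=False))

    def forward(self, z):                  # z: (num_meta_paths, num_nodes, dim)
        w = self.score(z).mean(dim=1)      # one scalar score per meta-path
        beta = torch.softmax(w, dim=0)     # attention over meta-paths
        fused = (beta.unsqueeze(1) * z).sum(dim=0)        # (num_nodes, dim)
        return fused, beta.squeeze(-1)     # beta makes the fusion explainable

class InfoMaxLoss(nn.Module):
    """Contrast real node embeddings with corrupted ones against a summary."""
    def __init__(self, dim):
        super().__init__()
        self.disc = nn.Bilinear(dim, dim, 1)
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, h_real, h_fake):
        s = torch.sigmoid(h_real.mean(dim=0, keepdim=True))   # graph summary
        pos = self.disc(h_real, s.expand_as(h_real))
        neg = self.disc(h_fake, s.expand_as(h_fake))
        labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
        return self.bce(torch.cat([pos, neg]), labels)

# Toy usage: 3 meta-paths, 10 scholars, 16-dim embeddings.
z = torch.randn(3, 10, 16)
fused, beta = MetaPathAttention(16)(z)
loss = InfoMaxLoss(16)(fused, fused[torch.randperm(10)])  # shuffled = corrupted
print(beta, loss.item())
```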

    Matching algorithms : fundamentals, applications and challenges

    Matching plays a vital role in the rational allocation of resources in many areas, ranging from market operation to people's daily lives. In economics, the term matching theory was coined for pairing two sets of agents in a specific market to reach a stable or optimal state. In computer science, many branches of matching problems have emerged, such as question-answer matching in information retrieval, user-item matching in recommender systems, and entity-relation matching in knowledge graphs. The preference list is the core element of a matching process; it can either be obtained directly from the agents or generated indirectly by prediction. Based on how the preference list is obtained, matching problems fall into two categories: explicit matching and implicit matching. In this paper, we first introduce matching theory's basic models and algorithms for explicit matching. We then review existing methods for coping with various implicit matching problems, such as retrieval matching, user-item matching, entity-relation matching, and image matching. Furthermore, we look into representative applications in these areas, including marriage and labor markets in explicit matching and several similarity-based matching problems in implicit matching. Finally, this survey concludes with a discussion of open issues and promising future directions in the field of matching. © 2017 IEEE. Please note that this article has multiple authors; only the first five, including Federation University Australia affiliates Jing Ren, Xia Feng, and Nargiz Sultanova, are listed in this record.
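
    The stable-marriage model at the heart of explicit matching is solved by the classic Gale-Shapley deferred-acceptance algorithm; a compact, runnable version follows, with a toy two-agent market as the usage example.

```python
# Gale-Shapley deferred acceptance: proposers propose in preference
# order; reviewers tentatively hold their best offer so far.
def gale_shapley(proposer_prefs, reviewer_prefs):
    # rank[r][p] = position of proposer p in reviewer r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in reviewer_prefs.items()}
    free = list(proposer_prefs)               # proposers not yet matched
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                              # reviewer -> proposer
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])           # reviewer trades up
            engaged[r] = p
        else:
            free.append(p)                    # rejected, tries next choice
    return {p: r for r, p in engaged.items()}

# Toy market: the result is stable (no pair prefers each other to their match).
men = {"a": ["x", "y"], "b": ["y", "x"]}
women = {"x": ["b", "a"], "y": ["a", "b"]}
print(gale_shapley(men, women))               # {'b': 'y', 'a': 'x'}
```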

    MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

    We present MosaicFusion, a simple yet effective diffusion-based data augmentation approach for large vocabulary instance segmentation. Our method is training-free and does not rely on any label supervision. Two key designs enable us to employ an off-the-shelf text-to-image diffusion model as a useful dataset generator for object instances and mask annotations. First, we divide an image canvas into several regions and perform a single round of the diffusion process to generate multiple instances simultaneously, conditioning on different text prompts. Second, we obtain the corresponding instance masks by aggregating cross-attention maps associated with object prompts across layers and diffusion time steps, followed by simple thresholding and edge-aware refinement. Without bells and whistles, MosaicFusion can produce a significant amount of synthetic labeled data for both rare and novel categories. Experimental results on the challenging LVIS long-tailed and open-vocabulary benchmarks demonstrate that MosaicFusion can significantly improve the performance of existing instance segmentation models, especially for rare and novel categories. Code will be released at https://github.com/Jiahao000/MosaicFusion.
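
    The mask-extraction step can be sketched in a few lines: average the cross-attention maps over layers and timesteps per prompt, normalize, and threshold. The attention tensor below is a random stand-in for what a real diffusion model would expose, and the fixed threshold stands in for the paper's thresholding plus edge-aware refinement.

```python
# Sketch only: aggregate-and-threshold mask extraction; the attention
# maps are simulated, not taken from a real diffusion model.
import numpy as np

def masks_from_attention(attn, threshold=0.5):
    """attn: (num_prompts, num_layers, num_steps, H, W) cross-attention maps."""
    agg = attn.mean(axis=(1, 2))                        # average layers & steps
    lo = agg.min(axis=(1, 2), keepdims=True)
    agg = (agg - lo) / (np.ptp(agg, axis=(1, 2), keepdims=True) + 1e-8)
    return agg > threshold                              # boolean instance masks

# Four canvas regions -> four prompts, each with simulated attention maps.
attn = np.random.rand(4, 16, 50, 64, 64)        # prompts, layers, steps, H, W
masks = masks_from_attention(attn)
print(masks.shape, masks.dtype)                 # (4, 64, 64) bool
```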

    Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

    In this work, we focus on open vocabulary instance segmentation, expanding a segmentation model to classify and segment instance-level novel categories. Previous approaches have relied on massive caption datasets and complex pipelines to establish one-to-one mappings between image regions and words in captions. However, such methods build noisy supervision by matching non-visible words, such as adjectives and verbs, to image regions. Meanwhile, context words are also important for inferring the existence of novel objects, as they show high inter-correlations with novel categories. To overcome these limitations, we devise a joint Caption Grounding and Generation (CGG) framework, which incorporates a novel grounding loss that focuses only on matching object nouns to improve learning efficiency. We also introduce a caption generation head that enables additional supervision and contextual modeling as a complement to the grounding loss. Our analysis and results demonstrate that the grounding and generation components complement each other, significantly enhancing segmentation performance for novel classes. Experiments on the COCO dataset under two settings, Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS), demonstrate the superiority of CGG. Specifically, CGG achieves a substantial improvement of 6.8% mAP for novel classes without extra data on the OVIS task, and a 15% PQ improvement for novel classes on the OSPS benchmark. Comment: ICCV 2023.
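
    A hedged sketch of the noun-only grounding idea follows: score region embeddings against embeddings of the caption's object nouns and reward strong alignment. The noun vocabulary, embedding sources, and max-alignment loss are simplifying assumptions; the actual CGG loss is learned end-to-end inside the full framework.

```python
# Sketch only: noun-filtered region-word alignment; the embeddings and
# loss form are assumed, not the CGG implementation.
import torch
import torch.nn.functional as F

def noun_grounding_loss(regions, word_embs, words, noun_vocab):
    # Keep only object nouns; adjectives/verbs are dropped as "non-visible".
    keep = [i for i, w in enumerate(words) if w in noun_vocab]
    nouns = word_embs[keep]                                   # (N, dim)
    sim = F.normalize(nouns, dim=-1) @ F.normalize(regions, dim=-1).T
    align = sim.max(dim=-1).values     # best-matching region for each noun
    return -align.mean()               # reward strong noun-region alignment

regions = torch.randn(10, 256)                         # 10 region features
words = ["a", "fluffy", "cat", "sits", "on", "a", "sofa"]
word_embs = torch.randn(len(words), 256)               # one embedding per word
loss = noun_grounding_loss(regions, word_embs, words, {"cat", "sofa"})
print(loss.item())
```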

    OV-VG: A Benchmark for Open-Vocabulary Visual Grounding

    Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light of the widespread adoption of vision-based foundation models. Its primary objective is to comprehend novel concepts that fall outside a predefined vocabulary. One key facet of this endeavor is visual grounding, which entails locating a specific region within an image based on a corresponding language description. While current foundation models excel at various vision-language tasks, there is a noticeable absence of models specifically tailored to open-vocabulary visual grounding. This work introduces two novel and challenging tasks: Open-Vocabulary Visual Grounding (OV-VG) and Open-Vocabulary Phrase Localization (OV-PL). The overarching aim is to establish connections between language descriptions and the localization of novel objects. To facilitate this, we curated a comprehensively annotated benchmark encompassing 7,272 OV-VG images and 1,000 OV-PL images. In addressing these challenges, we examined various baseline methodologies rooted in existing open-vocabulary object detection, visual grounding, and phrase localization frameworks. Surprisingly, we found that state-of-the-art methods often falter in diverse scenarios. Consequently, we developed a novel framework that integrates two critical components: Text-Image Query Selection and Language-Guided Feature Attention. These modules bolster the recognition of novel categories and enhance the alignment between visual and linguistic information. Extensive experiments demonstrate the efficacy of our proposed framework, which consistently attains state-of-the-art performance on the OV-VG task. Ablation studies provide further evidence of the effectiveness of our models. Code and datasets will be made publicly available at https://github.com/cv516Buaa/OV-VG
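
    Text-image query selection can be illustrated by choosing the image tokens most similar to the language embedding as initial decoder queries, as grounding detectors commonly do; the top-k rule and sizes below are assumptions, not the paper's code.

```python
# Sketch only: similarity-based query selection; the selection rule is
# a common grounding-detector pattern, assumed rather than taken from OV-VG.
import torch
import torch.nn.functional as F

def select_queries(img_tokens, text_emb, k=10):
    """img_tokens: (num_tokens, dim); text_emb: (num_words, dim)."""
    sim = F.normalize(img_tokens, dim=-1) @ F.normalize(text_emb, dim=-1).T
    scores = sim.max(dim=-1).values        # best word match per image token
    topk = scores.topk(k).indices
    return img_tokens[topk]                # (k, dim) initial decoder queries

img_tokens = torch.randn(1024, 256)        # e.g. a flattened feature map
text_emb = torch.randn(5, 256)             # embedded description words
queries = select_queries(img_tokens, text_emb)
print(queries.shape)                       # torch.Size([10, 256])
```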

    Transformer-Based Visual Segmentation: A Survey

    Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several closely related settings, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research. The project page can be found at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also continually monitor developments in this rapidly evolving field. Comment: Work in progress.
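
    The unifying meta-architecture can be condensed into a short sketch: a backbone produces features, a transformer decoder refines a set of learned queries against them, and each query emits a class score plus a mask via a dot product with per-pixel embeddings (MaskFormer-style). All sizes and the convolutional backbone stand-in below are illustrative.

```python
# Sketch only: a MaskFormer-style meta-architecture with a toy backbone.
import torch
import torch.nn as nn

class MetaSegmenter(nn.Module):
    def __init__(self, dim=256, num_queries=100, num_classes=80):
        super().__init__()
        self.backbone = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # stand-in
        self.queries = nn.Embedding(num_queries, dim)   # learned object queries
        layer = nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.cls_head = nn.Linear(dim, num_classes + 1)  # +1 for "no object"
        self.mask_embed = nn.Linear(dim, dim)

    def forward(self, images):                           # (B, 3, H, W)
        feats = self.backbone(images)                    # (B, dim, H/16, W/16)
        mem = feats.flatten(2).transpose(1, 2)           # (B, HW, dim)
        q = self.queries.weight.unsqueeze(0).expand(images.size(0), -1, -1)
        q = self.decoder(q, mem)                         # queries attend to pixels
        logits = self.cls_head(q)                        # (B, Q, classes+1)
        masks = torch.einsum("bqd,bdhw->bqhw", self.mask_embed(q), feats)
        return logits, masks

logits, masks = MetaSegmenter()(torch.randn(2, 3, 128, 128))
print(logits.shape, masks.shape)   # (2, 100, 81) and (2, 100, 8, 8)
```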


    Recent applications and chiral separation development based on stationary phases in open tubular capillary electrochromatography (2019–2022)

    Capillary electrochromatography (CEC) plays a significant role in chiral separation through its dual separation principle: the difference in partition coefficients between the two phases, and electroosmotic flow-driven separation. Because each inner-wall stationary phase (SP) has distinct properties, SPs differ from one another in separation ability, which leaves considerable room for promising applications of open tubular capillary electrochromatography (OT-CEC). We divide the OT-CEC SPs developed over the past four years into six types (ionic liquids, nanoparticle materials, microporous materials, biomaterials, non-nanopolymers, and others) and mainly introduce their characteristics in chiral drug separation. A few classic SPs from the past ten years are also included as supplements to enrich the description of each SP type. Additionally, we discuss applications to analytes beyond chiral drugs, in metabolomics, food, cosmetics, the environment, and biology. OT-CEC plays an increasingly significant role in chiral separation and may promote the development of capillary electrophoresis (CE) coupled with other instruments, such as CE with mass spectrometry (CE/MS) and CE with an ultraviolet detector (CE/UV).